feat(router): add intent-aware LoRA routing support #579
- Add LoRAAdapter struct to define available LoRA adapters per model
- Add lora_name field to ModelScore for specifying LoRA adapter
- Implement validation to ensure lora_name references defined LoRAs
- Update model selection logic to use LoRA name when specified
- Add comprehensive example configuration and documentation
- Update README to reflect LoRA adapter routing capability

This enables semantic router to route requests to different LoRA adapters based on classified intent/category, allowing domain-specific fine-tuned models to be selected automatically.

Fixes: #545

Signed-off-by: bitliu <[email protected]>
Add hybrid-cache.md to the Semantic Cache section in sidebar. Signed-off-by: bitliu <[email protected]>
Fix broken link from getting-started/quickstart.md to installation/installation.md Signed-off-by: bitliu <[email protected]>
Pull Request Overview
This PR adds support for intent-aware LoRA (Low-Rank Adaptation) routing in the Semantic Router, enabling automatic selection of domain-specific LoRA adapters based on classified query intent.
Key changes:
- Added LoRA adapter configuration support in model definitions with validation
- Implemented automatic LoRA name substitution in model selection logic
- Added comprehensive tutorial and configuration documentation

Reviewed Changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.
| File | Description | 
|---|---|
| website/sidebars.ts | Added new tutorial entries for LoRA routing and hybrid cache | 
| website/docs/tutorials/intelligent-route/lora-routing.md | New tutorial documenting LoRA routing setup and configuration | 
| website/docs/overview/categories/configuration.md | Extended documentation with LoRA adapter configuration details | 
| src/semantic-router/pkg/config/config.go | Added LoRA adapter data structures to support LoRA configuration | 
| src/semantic-router/pkg/config/validator.go | Added validation logic to ensure LoRA names reference defined adapters | 
| src/semantic-router/pkg/classification/classifier.go | Implemented LoRA name substitution in model selection | 
| config/intelligent-routing/in-tree/lora_routing.yaml | Added complete example configuration demonstrating LoRA routing | 
| README.md | Updated feature description and removed distributed tracing section | 
Co-authored-by: Copilot <[email protected]> Signed-off-by: Xunzhuo <[email protected]>

What type of PR is this?
Feature enhancement for intelligent routing system
What this PR does / why we need it:
This PR implements intent-aware LoRA (Low-Rank Adaptation) routing support, enabling the semantic router to automatically select different LoRA adapters based on the classified intent/category of incoming requests.
Key Changes:
Configuration Structure:
- Added `LoRAAdapter` struct to define available LoRA adapters per model in `model_config`
- Added `lora_name` field to `ModelScore` for specifying which LoRA adapter to use
- LoRA adapters must be defined in `model_config` before being referenced

Validation:
- Added `validateLoRAName()` to ensure `lora_name` references a valid LoRA defined in the model's configuration

Model Selection Logic:
- Updated `selectBestModelInternal()` to use LoRA name as the final model name when specified
- Updated `GetModelsForCategory()` to return LoRA names for proper filtering

Documentation & Examples:
- Added example configuration: `config/intelligent-routing/in-tree/lora_routing_example.yaml`
- Updated `website/docs/overview/categories/configuration.md`

Configuration Example:
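The example from the PR is not reproduced on this page. As a rough stand-in, the Go sketch below models the configuration shape described above and the check that a `lora_name` must refer to an adapter declared for its base model. Only the names `LoRAAdapter`, `ModelScore`, `lora_name`, `loras`, `model_config` and `validateLoRAName` come from this PR. The field layout, the `ModelParams` type, the function signature, and the model name `base-model` are assumptions made for illustration and may differ from the actual code in `config.go` and `validator.go`.

```go
package main

import "fmt"

// LoRAAdapter describes one adapter available on a base model.
// The field set is assumed; the PR only names the struct.
type LoRAAdapter struct {
	Name        string `yaml:"name"`
	Description string `yaml:"description,omitempty"`
}

// ModelParams is a hypothetical per-model entry under model_config.
type ModelParams struct {
	LoRAs []LoRAAdapter `yaml:"loras,omitempty"`
}

// ModelScore ranks a model for a category; LoRAName is optional.
type ModelScore struct {
	Model    string  `yaml:"model"`
	Score    float64 `yaml:"score"`
	LoRAName string  `yaml:"lora_name,omitempty"`
}

// validateLoRAName checks that a non-empty lora_name is declared in the
// loras list of the referenced base model. The signature is an assumption.
func validateLoRAName(modelConfig map[string]ModelParams, ms ModelScore) error {
	if ms.LoRAName == "" {
		return nil // no adapter requested, so the base model is used
	}
	params, ok := modelConfig[ms.Model]
	if !ok {
		return fmt.Errorf("model %q is not defined in model_config", ms.Model)
	}
	for _, l := range params.LoRAs {
		if l.Name == ms.LoRAName {
			return nil
		}
	}
	return fmt.Errorf("lora_name %q is not defined for model %q", ms.LoRAName, ms.Model)
}

func main() {
	// Hypothetical configuration: one base model exposing one adapter.
	modelConfig := map[string]ModelParams{
		"base-model": {LoRAs: []LoRAAdapter{{Name: "technical-lora"}}},
	}

	ok := ModelScore{Model: "base-model", Score: 1.0, LoRAName: "technical-lora"}
	bad := ModelScore{Model: "base-model", Score: 1.0, LoRAName: "missing-lora"}

	fmt.Println(validateLoRAName(modelConfig, ok))  // declared adapter: no error
	fmt.Println(validateLoRAName(modelConfig, bad)) // undeclared adapter: error
}
```

Rejecting an undeclared adapter at configuration load time, rather than at request time, means a misspelled `lora_name` surfaces before any traffic is routed.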
How It Works:
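1. LoRA adapters are defined in `model_config` under the base model
2. `lora_name` is set in the `ModelScore` for that category
3. Validation checks that `lora_name` is defined in the model's `loras` list
4. When `lora_name` is specified, it replaces the base model name
5. The request is forwarded with `model="technical-lora"`

Benefits: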
Prerequisites:
- vLLM server started with the `--enable-lora` flag
- LoRA adapters registered via the `--lora-modules` parameter

Fixes: #545
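To illustrate the flow listed under "How It Works", here is a minimal Go sketch of the name substitution step. It assumes that selection picks the highest-scoring `ModelScore` for a category and then returns the LoRA name in place of the base model name when one is set. The function `selectModelName`, the field layout, and the model name `base-model` are hypothetical. The PR's actual logic lives in `selectBestModelInternal()` in `classifier.go` and may differ.

```go
package main

import "fmt"

// ModelScore mirrors the PR's per-category entry; the field layout is assumed.
type ModelScore struct {
	Model    string
	Score    float64
	LoRAName string // optional; empty means "use the base model"
}

// selectModelName is a hypothetical stand-in for selectBestModelInternal():
// it picks the highest-scoring entry, then returns its LoRA name if one is
// set and the base model name otherwise. The bool is false for empty input.
func selectModelName(scores []ModelScore) (string, bool) {
	if len(scores) == 0 {
		return "", false
	}
	best := scores[0]
	for _, s := range scores[1:] {
		if s.Score > best.Score {
			best = s
		}
	}
	if best.LoRAName != "" {
		return best.LoRAName, true // the adapter name replaces the base model name
	}
	return best.Model, true
}

func main() {
	// Hypothetical scores for a "technical" category.
	technical := []ModelScore{
		{Model: "base-model", Score: 0.6},
		{Model: "base-model", Score: 0.9, LoRAName: "technical-lora"},
	}
	name, _ := selectModelName(technical)
	fmt.Println(name) // prints "technical-lora"
}
```

Returning the adapter name as the final model name lets the downstream request use the ordinary `model` field, so no separate LoRA parameter is needed on the request path.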